PHYSICAL
DATA
ORGANIZATION FUNDAMENTALS. SPACE SEARCH ALGORITHM PART-1
When a SQL statement inserts a new row into a table how does DB2
determine the page into which the row will be accommodated? Why do we
need to know about how DB2 searches for free space? How can we find out
how effective the space searches have been for a given table space?
This article is a primer before we move on to the space search
algorithms. A clear understanding of clustering
is needed before proceeding to space search algorithms.
If you are thorough with the concept of clustering, you can move on to
the next article in this series.
How does DB2
determine the physical location of a new record that might be
inserted?
To put it very simply.
In case of a table space DB2 :
(a) Tries to put the newly inserted row in that page of the table space
where the clustering is maintained if the table has a clustering index.
(b) Use any already existing empty space before allocating new space.
In case of an index DB2:
(a) Guarantee key sequence
(b) Use any already existing empty space before allocating new space.
IMPORTANT NOTE: I am not considering MEMBER CLUSTER option that is used
to reduce space map contention in data sharing environments.
About clustering, If the physical sequence of data rows in
the table space pages closely follow the logical
sequence of index entries (of the clustering index ) then the
data in the table space is said to be highly clustered. It must
be remembered that the index entries, no matter how disorganized they
might be physically are always in logical order.
Let us see what I mean by this.
Consider the index shown below. It shows a simple
index that is defined on PHONE_NUMBER ASCENDING. If you look at
the
the physical structuring of the index, the first set of entries of the
index are actually stored in the last page, the next set of entries are
stored in the first page and the last set of entries are stored some
where in the middle. In spite of this when ever you access the index in
sequence, you are GUARANTEED to access the LOGICAL PAGE 1 first,
immaterial of whether it is the 1st physical page or the 100 th
physical page. This is made possible by pointers that are
maintained by DB2's B-Tree structure that always points you to the
correct page and pointers between index pages.

Unlike an index , table spaces do not maintain pointers that maintain
logical order. So if DB2 stores row in a random order in
the table space pages and you access the table space without the index,
you will get the data in the same random order. If the cluster ratio of
the table space is 100 % then you will get the data in the sequential
logical order even without using the index.

Consider a perfectly clustered table space as shown above. The Physical
sequence of the records are in perfect match with the logical sequence
of the index.
Now assume that a new record with a telephone number of
210-440-1212 is being inserted into the table, in order for DB2
to maintain clustering, it needs to be inserted some where between PAGE
1 and PAGE 2 of the table space. If DB2 is unable to find free space
between PAGE1 and PAGE2 it has to insert this row in an other page that
is not in sequence. At this point the Cluster ratio of this table space
will fall from the perfect 100 %.
On the other hand in the the index, no matter where the
index entry goes physically, DB2 will update the pointers to include
the new entry at the correct position in the sequence. So the concept
of clustering is non-existent for indexes.
I will be dealing with space search algorithms specific to table spaces
and indexes, Significance of OFFPOS and INDREF measures in my
subsequent articles.